Scalable Structure Discovery in Regression using Gaussian Processes
Abstract
Automatic Bayesian Covariance Discovery (ABCD) in Lloyd et al. (2014) provides a framework for automating statistical modelling as well as exploratory data analysis for regression problems. However, ABCD does not scale due to its O(N^3) running time for the kernel search. This is undesirable not only because the average size of data sets is growing fast, but also because there is potentially more information in bigger data, implying a greater need for more expressive models that can discover finer structure. We propose Scalable Kernel Composition (SKC), a scalable kernel search algorithm, to encompass big data within the boundaries of automated statistical modelling.
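To illustrate where the cubic cost in the kernel search comes from, the following is a minimal sketch of the exact Gaussian process log marginal likelihood, the quantity a kernel search would typically evaluate for every candidate kernel. It is not the ABCD or SKC implementation: the function names, the squared-exponential kernel, the zero mean, and the default hyperparameters are all assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel(x1, x2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel on 1-D inputs (illustrative choice)."""
    sq_dist = (x1[:, None] - x2[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)


def log_marginal_likelihood(x, y, kernel, noise_var=0.1):
    """Exact zero-mean GP log marginal likelihood log p(y | x).

    noise_var is an assumed Gaussian observation-noise variance.
    """
    n = x.shape[0]
    K = kernel(x, x) + noise_var * np.eye(n)  # N x N covariance matrix
    L = np.linalg.cholesky(K)                 # O(N^3): the bottleneck
    # Solve K alpha = y using the Cholesky factor K = L L^T.
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    data_fit = -0.5 * y @ alpha
    complexity = -np.sum(np.log(np.diag(L)))  # equals -0.5 * log det K
    constant = -0.5 * n * np.log(2.0 * np.pi)
    return data_fit + complexity + constant


if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 50)
    y = np.sin(2.0 * np.pi * x)
    print(log_marginal_likelihood(x, y, rbf_kernel))
```

Because the Cholesky factorisation is repeated for each candidate kernel and for each hyperparameter update, its cubic cost in N dominates the search on large data sets.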
Similar resources
Scaling up the Automatic Statistician: Scalable Structure Discovery using Gaussian Processes
Automating statistical modelling is a challenging problem that has far-reaching implications for artificial intelligence. The Automatic Statistician employs a kernel search algorithm to provide a first step in this direction for regression problems. However, this does not scale due to its O(N^3) running time for the model selection. This is undesirable not only because the average size of data set...
Covariance Kernels for Fast Automatic Pattern Discovery and Extrapolation with Gaussian Processes
Truly intelligent systems are capable of pattern discovery and extrapolation without human intervention. Bayesian nonparametric models, which can uniquely represent expressive prior information and detailed inductive biases, provide a distinct opportunity to develop intelligent systems, with applications in essentially any learning and prediction task. Gaussian processes are rich distributions ...
Scalable Gaussian Process Regression Using Deep Neural Networks
We propose a scalable Gaussian process model for regression by applying a deep neural network as the feature-mapping function. We first pre-train the deep neural network with a stacked denoising auto-encoder in an unsupervised way. Then, we perform a Bayesian linear regression on the top layer of the pre-trained deep network. The resulting model, Deep-Neural-Network-based Gaussian Process (DNN-...
Parametric Gaussian Process Regression for Big Data
This work introduces the concept of parametric Gaussian processes (PGPs), which is built upon the seemingly self-contradictory idea of making Gaussian processes parametric. Parametric Gaussian processes, by construction, are designed to operate in “big data” regimes where one is interested in quantifying the uncertainty associated with noisy data. The proposed methodology circumvents the welles...
Enabling scalable stochastic gradient-based inference for Gaussian processes by employing the Unbiased LInear System SolvEr (ULISSE)
In applications of Gaussian processes where quantification of uncertainty is of primary interest, it is necessary to accurately characterize the posterior distribution over covariance parameters. This paper proposes an adaptation of the Stochastic Gradient Langevin Dynamics algorithm to draw samples from the posterior distribution over covariance parameters with negligible bias and without the ...